5.  INTRODUCTION

The  purpose  of  this  project  is  to  develop  a  database  of  digital  mammograms  that  can  be 
used  by  researchers  who  (1)  are  trying  to  determine  the  image  quality  requirements  of  detectors  for 
digital  mammography;  (2)  are  developing  image  processing  techniques  to  optimize  the  displayed 
digital  mammogram;  (3)  are  developing  computerized  methods  for  analyzing  mammograms;  (4)  are 
studying  the  effects  of  image  compression  methods  on  image  quality;  (5)  are  developing  methods 
for  remote  transmission  of  mammograms;  and  (6)  are  studying  the  relationship  between  image 
quality  and  diagnostic  accuracy.  This  database  also  could  be  used  as  a  resource  for  teaching 
radiology  residents  and  for  testing  the  performance  levels  of  mammographers. 

The  specific  aims  of  this  proposal  are: 

1. Collect and digitize 200 cases in each of 5 categories of mammograms, those exhibiting: (i)
clustered microcalcifications, (ii) masses, (iii) architectural distortions, (iv) asymmetric densities, and
(v) no lesions (i.e., normals).

2. Make these cases available to other researchers, either over a computer network (Internet) or by
sending images on computer tape or CD. The database will be distributed as widely as possible so
that comparisons of different computerized analysis techniques can be standardized.

6.  BODY 


This research is being funded as an infrastructure award, and as such it does not represent a
research project per se. That is, there is no hypothesis that we are trying to prove. Therefore, this
report is structured slightly differently from a normal scientific research report: it is heavy on method
and light on actual results. In this project, the procedure is the most important component; it is
applied continuously and in a straightforward manner to achieve the goal of creating the database of
mammograms.


Task 1: Collect and digitize mammograms (see Figure 1)

Until recently (July), the mammography section had been severely understaffed and without
a section chief. Two new radiologists have been hired, but they have been preoccupied with getting
the clinical side of the section running smoothly again. It has therefore been difficult to add more
cases to the database. Specifically, we need a radiologist to review each case to ensure that it meets
the requirements for inclusion in the database, review the pathology report, and mark the location of
the lesion(s). If neither of the two new radiologists can devote time to the project, then I will
enlist Carl Vyborny, a radiologist with expertise in mammography, to help out on the
project. He must be added to the IRB protocol before he can begin; this is now being
done.

We now have 611 cases digitized (see Table 1, at the end of the report). We performed a
prerelease of 50 cases of clustered microcalcifications this year to test the database. It was found
that the truth data needed to be specified more exactly. Initially we had indicated only the center of
the lesion, but not its extent. We are now specifying the four corners of a rectangle that completely
encloses the cluster of calcifications. Because we are still short-staffed in the radiology
department, it has been difficult to get a radiologist to mark the truth. We did hire a consultant
radiologist for a 2-month period, but she is currently unavailable. Once the truth has been recorded,
we will release the 50 cases. We will follow this initial release with an additional 50 cases of
microcalcifications and 100 cases of masses. As more cases accrue, further releases will be made in
batches of 50 or 100. We will accelerate our case accrual to meet our goal of 1000 cases.
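
The change in truth data described above, from a single center point to the four corners of an
enclosing rectangle, can be illustrated with a minimal sketch. The class name, field names, and
pixel-coordinate convention below are assumptions made for illustration; the report does not
specify how the truth files are actually stored.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class TruthRect:
    """Hypothetical truth record: axis-aligned rectangle in pixel coordinates."""
    left: int
    top: int
    right: int
    bottom: int

    @classmethod
    def from_corners(cls, corners: Iterable[Tuple[int, int]]) -> "TruthRect":
        """Build the rectangle from its four (x, y) corner points, in any order."""
        pts = list(corners)
        if len(pts) != 4:
            raise ValueError("expected exactly four corner points")
        xs = [x for x, _ in pts]
        ys = [y for _, y in pts]
        return cls(min(xs), min(ys), max(xs), max(ys))

    @property
    def center(self) -> Tuple[float, float]:
        """The center alone is what the initial truth data recorded."""
        return ((self.left + self.right) / 2, (self.top + self.bottom) / 2)

    def contains(self, x: float, y: float) -> bool:
        """True if the point lies inside or on the border of the rectangle."""
        return self.left <= x <= self.right and self.top <= y <= self.bottom


rect = TruthRect.from_corners([(120, 340), (260, 340), (260, 455), (120, 455)])
print(rect.center)              # center of the marked cluster
print(rect.contains(200, 400))  # is a candidate location inside the marked extent?
```

With the extent recorded, a user can test whether a reported location falls anywhere within the
marked cluster, which a center point alone does not allow.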


Figure 1. A flowchart of the steps required to collect, digitize, archive, and distribute
the mammographic database. The 'Full Image' is the whole digitized mammogram at full
resolution. The 'Reduced Image' is a minified version (reduced resolution) of the full
image. The 'ROI Image' is a portion of the full image at full resolution.
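
The relationship between the three image types named in the caption can be sketched as follows.
This is an illustration only: the report does not state how the reduced image is computed, so
block averaging is an assumption, and the function names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values


def reduce_image(full: Image, factor: int) -> Image:
    """Minify by averaging non-overlapping factor x factor blocks (assumed method)."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    rows = len(full) // factor
    cols = len(full[0]) // factor
    reduced = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [
                full[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            out_row.append(sum(block) // len(block))
        reduced.append(out_row)
    return reduced


def extract_roi(full: Image, top: int, left: int, height: int, width: int) -> Image:
    """Cut a region of interest out of the full image without resampling."""
    return [row[left:left + width] for row in full[top:top + height]]


full_image = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [10, 10, 20, 20],
    [10, 10, 20, 20],
]
print(reduce_image(full_image, 2))       # whole field of view, lower resolution
print(extract_roi(full_image, 1, 1, 2, 2))  # part of the field of view, full resolution
```

The reduced image keeps the whole breast in view at lower resolution, while the ROI image keeps
full resolution over a smaller area.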

It is important that this database be used in such a way that comparisons between different
algorithms can be made. This is the main motivation for creating such a database. To allow
valid comparisons, two things are needed: (i) the exact same cases must be used to
measure performance, and (ii) the same criteria must be used for scoring the results. To this end,
we have divided the database into testing and training cases. Furthermore, we are preparing
recommendations for scoring the results, based on some preliminary results in our lab. [1] This
information will be released with the database. No other database offers such instructions for its
use, and thus comparisons of different techniques remain difficult.
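
To show why a shared scoring criterion matters, here is a minimal sketch of one possible rule.
It is not the recommendation being prepared for release with the database: the criterion (a
detection counts as a hit if it falls inside a truth rectangle) and all names are assumptions.

```python
from typing import Sequence, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), assumed layout
Point = Tuple[float, float]       # (x, y) location reported by an algorithm


def inside(p: Point, r: Rect) -> bool:
    x, y = p
    left, top, right, bottom = r
    return left <= x <= right and top <= y <= bottom


def score_image(truth: Sequence[Rect], detections: Sequence[Point]) -> Tuple[int, int]:
    """Return (lesions_found, false_positives) for one image."""
    found = sum(1 for r in truth if any(inside(p, r) for p in detections))
    false_pos = sum(1 for p in detections if not any(inside(p, r) for r in truth))
    return found, false_pos


def summarize(cases: Sequence[Tuple[Sequence[Rect], Sequence[Point]]]) -> Tuple[float, float]:
    """Sensitivity and mean false positives per image over a fixed set of cases."""
    if not cases:
        raise ValueError("no cases to score")
    total_lesions = found = false_pos = 0
    for truth, detections in cases:
        f, fp = score_image(truth, detections)
        total_lesions += len(truth)
        found += f
        false_pos += fp
    sensitivity = found / total_lesions if total_lesions else 0.0
    return sensitivity, false_pos / len(cases)


test_cases = [
    ([(10, 10, 50, 50)], [(20, 20), (80, 80)]),  # one hit, one false positive
    ([(0, 0, 5, 5)], []),                        # missed lesion
    ([], [(1, 1)]),                              # normal case, one false positive
]
print(summarize(test_cases))
```

Two groups that run this same function on the same testing cases obtain directly comparable
numbers; changing either the case set or the hit rule breaks that comparability.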

Task  2:  Establish  protocol  for  transmitting  database 

We  originally  considered  the  ACR/NEMA  (DICOM)  image  format  for  our  database. 
However,  when  we  began  our  work,  the  ACR/NEMA  format  did  not  have  a  module  for 
mammography,  and  it  would  have  been  an  extensive  project  to  develop  one  at  that  time.  Recently  a 
digital  mammography  module  has  been  approved.  We  will  consider  whether  it  is  appropriate  to  use 
this  file  format,  since  many  users  of  our  database  may  not  have  a  DICOM  reader. 


Task  3:  Maintain  database  and  distribute  cases 

Maintenance and distribution of the database are currently minimal.
These tasks will become important shortly, as cases go "on-line". Cases are being archived on
4 mm tape and DVD.


7.  KEY  RESEARCH  ACCOMPLISHMENTS 
•  Collection of 611 mammographic cases

8.  REPORTABLE  OUTCOMES 


Presentations  and  Manuscripts: 

1. Nishikawa RM, Wolverton DE, Schmidt RA, Johnson RE, Pisano ED, Hemminger BM: A
common database of mammograms for research in digital mammography. U.S. Army Medical
Research and Materiel Command Breast Cancer Research Program: An Era of Hope,
November 1997, Washington, DC.

2. Nishikawa RM, Wolverton DE, Schmidt RA, Pisano ED, Hemminger BM, Moody J: A
common database of mammograms for research in digital mammography. In: Doi K, Giger
ML, Nishikawa RM, and Schmidt RA (eds.). Digital Mammography '96. (Amsterdam:
Elsevier Science) 435-438, 1996.

3. Nishikawa RM: Mammographic databases. Breast Disease 10: 137-150, 1998.


9.  CONCLUSIONS 


The development of a common database of mammograms for digital mammography
research is underway. We performed a prerelease of 50 cases of clustered microcalcifications to
test the database. It was found that the truth data needed to be specified more exactly. Accrual of
cases has been hampered by short staffing in our mammography section. If necessary, we
will hire an outside radiologist, Carl Vyborny, to assist us. We will follow the initial release with an
additional 50 cases of microcalcifications and 100 cases of masses. As more cases accrue, further
releases will be made. Cases will be released with instructions that will allow all users of the
database to compare their results in a meaningful manner. Such comparisons are currently either not
possible or problematic with any existing database.

10.  REFERENCES 

Table 1. Breakdown of cases in the database as of October 1, 1999.

Type of Lesion             Pathology    # of Cases
Mass                       Malignant    116
Mass                       Benign       75
Microcalcifications        Malignant    115
Microcalcifications        Benign       87
Asymmetric Density         Malignant    18
Asymmetric Density         Benign       4
Architectural Distortion   Malignant    25

